
Multikernel Activation Functions: Formulation and a Case Study / Scardapane, Simone; Nieddu, Elena; Firmani, Donatella; Merialdo, Paolo. - (2020), pp. 320-329. (Paper presented at the 2019 INNS Big Data and Deep Learning conference, held in Sestri Levante, Italy) [10.1007/978-3-030-16841-4_33].

Multikernel Activation Functions: Formulation and a Case Study

Scardapane, Simone; Nieddu, Elena; Firmani, Donatella; Merialdo, Paolo
2020

Abstract

The design of activation functions is a growing research area in the field of neural networks. In particular, instead of using fixed point-wise functions (e.g., the rectified linear unit), several authors have proposed ways of learning these functions directly from the data in a non-parametric fashion. In this paper we focus on the kernel activation function (KAF), a recently proposed framework wherein each function is modeled as a one-dimensional kernel model, whose weights are adapted through standard backpropagation-based optimization. One drawback of KAFs is the need to select a single kernel function and its eventual hyper-parameters. To partially overcome this problem, we motivate an extension of the KAF model, in which multiple kernels are linearly combined at every neuron, inspired by the literature on multiple kernel learning. We provide an application of the resulting multi-KAF on a realistic use case, specifically handwritten Latin OCR, on a large dataset collected in the context of the ‘In Codice Ratio’ project. Results show that multi-KAFs can improve the accuracy of the convolutional networks previously developed for the task, with faster convergence, even with a smaller number of overall parameters.
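To make the idea concrete, the multi-KAF described above can be sketched as a one-dimensional kernel expansion over a fixed dictionary of sample points, with one set of learnable mixing weights per kernel. The sketch below is a minimal NumPy illustration, not the authors' implementation: the specific kernel choices (Gaussian and triangular) and all names are assumptions for exposition, and in practice the weights `alpha` would be trained by backpropagation alongside the rest of the network.

```python
import numpy as np

def multi_kaf(s, dictionary, alpha, gamma=1.0):
    """Illustrative multi-KAF: the activation is a linear combination of
    two 1D kernel expansions over a fixed dictionary of points.
    s          : (n,)   pre-activations of a neuron
    dictionary : (D,)   fixed sample points shared across the layer
    alpha      : (2, D) mixing weights, one row per kernel (learnable)
    """
    diff = s[:, None] - dictionary[None, :]       # (n, D) pairwise distances
    k_gauss = np.exp(-gamma * diff ** 2)          # Gaussian kernel
    k_tri = np.maximum(0.0, 1.0 - np.abs(diff))   # triangular kernel (assumed choice)
    # Each kernel expansion contributes through its own weight vector.
    return k_gauss @ alpha[0] + k_tri @ alpha[1]  # (n,)
```

With `alpha` set to zero the activation is identically zero; placing weight on a single Gaussian centered at the origin recovers a bump-shaped activation, showing how the mixture can approximate flexible shapes as the weights adapt during training.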
2019 INNS Big Data and Deep Learning conference
activation function; multikernel; OCR; Latin
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this product

File: Scardapane_Multikernel_2020.pdf
Access: restricted (archive managers only); contact the author
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 708.44 kB
Format: Adobe PDF

File: Scardapane_preprint_Multikernel_2019.pdf
Access: open access
Note: https://doi.org/10.1007/978-3-030-16841-4_33
Type: Pre-print (manuscript submitted to the publisher, prior to peer review)
License: All rights reserved
Size: 1.88 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1702194